[57] ABSTRACT

A speech signal is analyzed by applying the signal to 
formant filters which derive first, second and third sig- 
nals respectively representing the frequency of the 
speech waveform in the first, second and third formants.
A first pulse train having a pulse rate approximately
representing the average frequency of the first
formant is derived; second and third pulse trains having 
pulse rates respectively representing zero crossings of 
the second and third formants are derived. The first 
formant pulse train is derived by establishing N signal 
level bands, where N is an integer at least equal to two. 
Adjacent ones of the signal bands have common bound- 
aries, each of which is a predetermined percentage of 
the peak level of a complete cycle of the speech wave- 
form. A first level of the first pulse train is derived while 
the first formant signal has an amplitude lying in even 
numbered ones of the bands; a second level is derived 
while the first formant signal has an amplitude lying in 
odd numbered ones of the bands. The pulse trains
representing the first and third formant signals are
normalized relative to the second formant pulse train.
Normalization is attained in each instance by counting the
number of pulses in the first and third pulse trains over the
interval required for the pulses in the second train to 
reach a predetermined number. The resulting normal- 
ized pulse trains are supplied to a memory to identify a 
phoneme in the speech signal or are transmitted as nar- 
row band width signals. 

28 Claims, 10 Drawing Figures

SPEECH ANALYZER
ORIGIN OF THE INVENTION 

The invention described herein was made by an employee
of the United States Government and may be
manufactured and used by or for the Government for
governmental purposes without the payment of any
royalties thereon or therefor.

FIELD OF THE INVENTION

The present invention relates generally to speech 
analyzers and more particularly to a speech analyzer 
wherein signals representing the frequency content of a 
pair of formants are compared with each other. In
accordance with another aspect of the invention, a speech
quantizer derives a bilevel signal having first and
second levels while the speech signal has amplitudes
respectively lying in even and odd numbered signal level
amplitude bands, where adjacent bands have common
boundaries.

BACKGROUND OF THE INVENTION 

Devices to analyze speech waveforms have application
in assisting the deaf and in narrow band width
communications. For both applications, each speech
utterance, i.e., phoneme, is coded into a different signal,
whereby each phoneme has a unique relationship to the
coded signal. To assist the deaf, the unique phoneme to
signal relationship is utilized to activate an indicator,
usually visual, that the deaf can perceive. For narrow
band width communication systems the speech signal is
transformed into phoneme indicating signals having a
band width that is less than approximately 100 bits per
second.

Prior art speech analyzers have generally fallen into
one of three categories, each of which appears to have
certain deficiencies. One of the most commonly
employed prior art devices has used detectors for
determining when a speech waveform crosses a predetermined
amplitude, typically the average, or zero, value
of the waveform. Devices of this nature are often
referred to as zero crossing detectors since they derive
pulse outputs in response to the waveform crossing the
zero value. Typically, the number of pulses derived
over a predetermined time interval provides an
indication of the frequency of the speech waveform. Zero
crossing detectors have a tendency to respond only to a
frequency component having the highest amplitude,
particularly when one frequency component has an
amplitude that is much higher than any of the other
frequency components. For the first formant (typically
270-730 Hertz), where there is appreciable, important
information in frequency components having lower
amplitudes than a peak component, this tendency may
result in serious loss of information. If two or more
frequencies have approximately the same amplitude, the
zero crossing detector has a tendency to capture either
the highest frequency or the lowest frequency in the
waveform, depending upon adjustments made to the
zero crossing detector. By responding to or capturing the
highest or lowest frequency the prior art devices have
not been well suited to provide accurate information for
speakers having widely differing glottal or fundamental
frequencies, as exist between men, women and children.

Another type of prior art speech analyzer has em- 
ployed relatively complex apparatus for analyzing the 
speech spectrum in raw form. Such analyzers typically
employ a bank of many parallel bandpass filters
responsive to a speech source. Each filter supplies energy in a
relatively narrow pass band to an associated amplitude 
detector and the detectors drive relatively complex 
processing circuitry. It has been found that such analy- 
zers, in addition to being relatively complex, suffer from 
the deficiency of providing excessive information. The 
amount of information derived is often so great that
difficulties arise in coding the resultant information into 
an indication of the uttered phoneme. A further defi- 
ciency in spectrum analyzers is that they do not con- 
sider phase information of the different components that 
form a phoneme. Instead, there is derived a d.c. signal 
indicative of the phoneme amplitude. 

The third type of proposed speech analyzer is capable 
of learning the characteristics of different speakers. 
Such systems, however, must generally be programmed 
for each individual speaker and are not usually adapted 
to analyze the speech of a wide variety of speakers 
whose speech patterns have not been programmed into 
a memory of the analyzer. 

BRIEF DESCRIPTION OF THE INVENTION 

In accordance with one aspect of the present inven- 
tion, an improved speech quantizer provides informa- 
tion regarding the amplitude, frequency and phase of a 
speech waveform, and in particular the first formant. 
The improved quantizer derives a bilevel signal, i.e., a signal
having first and second levels while the speech wave- 
form has amplitudes respectively lying in even and odd 
numbered amplitude bands; there are thereby at least 
two, and preferably more than two, amplitude bands. 
The bands have common boundaries, each of which is a 
predetermined percentage of the peak level of one com- 
plete cycle of the speech waveform. Since each of the 
boundaries is a predetermined percentage of the peak 
level of one cycle of the waveform, the speech wave- 
form amplitude is normalized. Establishing the bounda- 
ries is attained in a relatively simple manner by supply- 
ing the speech waveform to an automatic gain control 
(AGC) amplifier, which derives an output that is ap- 
plied to a number of amplitude detectors. In response to 
alternate ones of the amplitude detectors being acti- 
vated alternate triggering levels are supplied to a 
Schmitt trigger which derives the bilevel signals. 

While there are prior art systems wherein the average 
frequency of a speech wave is computed to determine 
the centroid of the output of a formant filter, as
disclosed in U.S. Pat. Nos. 3,078,345 and 2,857,465 to
Campanella and Schroeder, the prior art systems employ
relatively complex computer circuitry that is not easily 
implemented. In addition, Campanella requires a plural- 
ity of narrow bandpass filters to determine discrete 
frequency components in each formant. 

In accordance with a further aspect of the invention, 
the speech signal is divided into first, second and third 
formants and the frequency content indication for the 
second formant normalizes the signals indicative of the 
frequency content of the third and/or first formants. By 
normalizing the first and third formant frequencies rela- 
tive to the second formant frequency it is possible to 
accurately analyze the speech of different speakers.
For any particular phoneme, the fundamental fre- 
quency of a woman or child is generally shifted up- 
wardly by approximately 10% relative to that of a man. 
This usually causes a 10% displacement of the speech 
content of the different types of speakers over each of 
the three formants. By normalizing the frequencies of
the first and third formants relative to the second
formant frequency, particularly by taking the ratios of the
first to second formants and third to second formants,
compensation is provided for the shift in fundamental
frequency of different speakers. Signals indicative of the
frequency contents of the two normalized formants can
be applied to a two dimensional memory matrix, to
indicate the original uttered phoneme. The matrix can
be in direct proximity to the speech source, for deaf
applications, or at the end of a transmission link for
narrow band width transmission applications.
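
As a short worked illustration of why the ratios are insensitive to a uniform shift, suppose one speaker's formant frequencies are F_1, F_2 and F_3 and a second speaker's are shifted by a common factor k (for example k = 1.1 for the approximately 10% shift mentioned above). The symbols k, F_i and F_i' are introduced here only for illustration and do not appear in the original text.

```latex
F_i' = k F_i \quad (i = 1, 2, 3)
\;\Longrightarrow\;
\frac{F_1'}{F_2'} = \frac{k F_1}{k F_2} = \frac{F_1}{F_2},
\qquad
\frac{F_3'}{F_2'} = \frac{k F_3}{k F_2} = \frac{F_3}{F_2}.
```

The cancellation is exact only to the extent that all three formants shift by the same factor, which the text describes as the usual case.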

The signal indicative of the first formant frequency
content is derived by utilizing the quantizer that derives
the bilevel signal indicative of frequency, amplitude and
phase, as described supra. For the second and third
formants, where the frequency information is dominant
over the amplitude information, a zero crossing detector
may be employed to provide the frequency information.
It is not necessary to employ the frequency, amplitude
and phase quantizer for the second and third formants
because the tendency for different frequencies to be
clustered in close proximity to each other does not exist
in these formants to the same extent as in the first
formant. Hence, for the second and third formants the zero
crossing detector has a frequency error, in absolute
terms, that is considerably less than the error for the
first formant.

In accordance with another aspect of the invention, 
normalization of the frequency content of the first and 
second formants and the third and second formants is
provided with a relatively simple apparatus utilizing 
first and second counters respectively responsive to 
pulse trains representing the frequency content of the 
first and third formants and a predetermined counter 
responsive to a pulse train representing the frequency of
the second formant. In response to the predetermined 
count being reached, the contents of the two counters 
responsive to the first and third formant pulse trains are 
frozen and ultimately read out. 

It is accordingly an object of the present invention to
provide a new and improved speech analyzer. 

Another object of the invention is to provide a speech 
analyzer employing a quantizer that responds to the 
relative amplitude of different frequencies of a speech 
signal.

A further object of the invention is to provide a new 
and improved speech analyzer wherein an accurate 
indication of phoneme utterance is provided for speak- 
ers having widely varying speech characteristics. 

An additional object of the invention is to provide a
speech analyzer wherein the frequency content of one 
formant is normalized against another formant. 

Still another object of the invention is to provide a
new and improved digital speech analyzer that is
relatively simple and yet provides accurate information of
the relative amplitudes and frequencies which compose
phonemes and other sounds used in communications;
exemplary of the other sounds are sirens, whistles,
telephone rings and door knocks.

Still another object of the invention is to provide an
apparatus for phoneme distinction or segmentation and 
conversion to digital output code by using a digital table 
look-up scheme. 

The above and still further objects, features and
advantages of the present invention will become apparent
upon consideration of the following detailed description
of one specific embodiment thereof, especially when
taken in conjunction with the accompanying drawing.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is an overall block diagram of a preferred
embodiment of the invention;

FIG. 2 is a circuit diagram of one embodiment of the 
quantizer of FIG. 1; and 

FIGS. 3A-6B are waveforms useful in describing the 
quantizer of FIG. 2. 

DETAILED DESCRIPTION OF THE DRAWING 

Reference is now made to FIG. 1 of the drawing 
wherein there is illustrated, in block diagram form, one 
embodiment of a speech analyzer in accordance with 
the present invention. A speech signal to be analyzed is 
derived from a suitable source, e.g., microphone 11, 
which feeds automatic gain control (AGC) amplifier 12. 
Amplifier 12 derives an output signal such that the 
speech waveform has approximately the same peak 
amplitude over each complete cycle, i.e., phoneme, 
whereby a speech waveform having a normalized maxi- 
mum amplitude is derived from the amplifier. 

The normalized output signal of amplifier 12 is ap- 
plied in parallel to processing circuitry including band- 
pass, formant filters 13, 14 and 15 respectively having 
pass bands for the first, second and third formants.
Filters 13, 14 and 15 have pass bands ($F_1$, $F_2$ and $F_3$)
(bandpasses between minus 3 dB points) as follows:

$358\ \text{Hertz} \le F_1 \le 742\ \text{Hertz}$

$1074\ \text{Hertz} \le F_2 \le 2226\ \text{Hertz}$

$1790\ \text{Hertz} \le F_3 \le 3710\ \text{Hertz}$

While filters 13-15 have pass bands in the stated fre- 
quency range, the skirts of these filters are not very 
steep so that appreciable energy in frequencies outside 
of the pass bands thereof is derived. Thereby, the output 
of filter 13 includes frequency components in excess of 
its high frequency cutoff of 742 Hertz and filter 14 
passes frequencies lower than its low frequency cutoff 
of 1074 Hertz. Active equalization filter 16 is connected 
between the output of amplifier 12 and the inputs of 
each of filters 13-15 to increase the amplitude of the 
high frequencies relative to the low frequencies and 
provide a uniform amplitude versus frequency charac- 
teristic for the analyzer. 

Formant filters 13, 14 and 15 derive analog output 
signals that are respectively applied to frequency analy- 
zers 17, 18 and 19. Frequency analyzers 18 and 19, for 
the second and third formants, are conventional zero 
crossing detectors that derive a pulse each time a posi- 
tive going portion of the waveform applied to them 
goes through zero by using adjustable amplitude 
Schmitt trigger and/or threshold circuits. Hence, detec- 
tors 18 and 19 respectively derive pulse trains having 
pulse rates proportional to the frequencies of the second 
and third formants. 
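
The behavior of detectors 18 and 19 can be sketched in software as follows. This is a minimal illustration on sampled data, not the Schmitt trigger circuit itself; the function names, the 48 kHz sample rate and the 1500 Hz test tone are assumptions chosen for the example.

```python
import math

def positive_zero_crossings(samples):
    """Count points where the waveform goes from <= 0 to > 0."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev <= 0.0 < cur)

def pulse_rate(samples, sample_rate):
    """Pulses per second over the duration of the sample block."""
    return positive_zero_crossings(samples) * sample_rate / len(samples)

# One second of a 1500 Hz tone sampled at 48 kHz (assumed values).
# The 0.3 rad phase offset keeps every sample away from an exact zero.
SAMPLE_RATE = 48000
tone = [math.sin(2 * math.pi * 1500 * k / SAMPLE_RATE + 0.3)
        for k in range(SAMPLE_RATE)]
rate = pulse_rate(tone, SAMPLE_RATE)
print(rate)
```

For a single tone the pulse rate equals the tone frequency, which is the sense in which the pulse trains are proportional to the formant frequencies.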

Frequency analyzer 17, responsive to the first for- 
mant signal derived from filter 13, however, provides 
an indication of the relative amplitude and phase of the 
different frequencies applied to it. The number of pulses 
derived by quantizer 17 appears to be related to the 
average frequency of the signal applied to it; the term 
“average frequency” is related to the frequency and 
relative amplitudes of the different components
supplied to quantizer 17. For example, if the input to
quantizer 17 is represented by: 

$$A_1\cos(\omega_1 t + \phi_1) + A_2\cos(\omega_2 t + \phi_2) + \dots + A_n\cos(\omega_n t + \phi_n) \qquad (1)$$

where:

$\omega_1, \omega_2, \dots, \omega_n$ are $2\pi$ times the frequency components
($f_1, f_2, \dots, f_n$) applied to the quantizer,

$A_1, A_2, \dots, A_n$ are respectively the amplitudes of the
components $\omega_1, \omega_2, \dots, \omega_n$, and

$\phi_1, \phi_2, \dots, \phi_n$ are the phases of $\omega_1, \omega_2, \dots, \omega_n$, the
average frequency, $\bar{f}$, of the input of quantizer 17 is
represented as:

$$n\bar{f} = A_1 f_1 + A_2 f_2 + \dots + A_n f_n \qquad (2)$$

The output pulses of quantizer 17 and zero crossing
detector 19, representing the frequency contents of the
first and third formants, are normalized against the
number of output pulses of detector 18, representing the
frequency content of the second formant. Normalization
is performed by taking the approximate ratio of the
number of pulses derived from quantizer 17 and detector
18 over a predetermined time interval and the number
of pulses derived from detectors 19 and 18 over the
same time interval. In particular, the frequency contents
of the first and third formants are normalized relative to
the second formant by detecting the number of output
pulses of quantizer 17 and detector 19 over the time
interval required for detector 18 to reach a predetermined
count. The normalized counts are periodically
read out to an analyzer apparatus. If the number of
pulses derived from detector 18 fails to reach the
predetermined count within a predetermined time interval,
referred to as a sampling interval and equal to the
interval between adjacent read out operations to the
analyzer, the counts from quantizer 17 and detector 19 during
the sampling interval are read out to the analyzer. In
one embodiment, the sampling interval is 60 milliseconds
and the predetermined count equals 64.
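
The counting scheme just described can be sketched as follows. This is an illustrative model under stated assumptions, not the patent's circuit: the pulse trains are idealized as perfectly periodic, and the function names and the example rates (500, 1500 and 2500 pulses per second, and the same rates raised by 10%) are hypothetical.

```python
def pulse_times(rate_hz, interval):
    """Idealized pulse train: one pulse every 1/rate_hz seconds."""
    return [(k + 1) / rate_hz for k in range(int(rate_hz * interval) + 1)]

def normalized_counts(f1_pulses, f2_pulses, f3_pulses, interval=0.060, limit=64):
    """Count first and third formant pulses until the second formant
    train has delivered `limit` pulses, or until the sampling interval ends."""
    f2_in_window = [t for t in f2_pulses if t < interval]
    if len(f2_in_window) >= limit:
        stop = f2_in_window[limit - 1]   # moment the predetermined count is reached
    else:
        stop = interval                  # count never reached: use the whole interval

    def count(pulses):
        return sum(1 for t in pulses if t < interval and t <= stop)

    return count(f1_pulses), count(f3_pulses)

def readout(f1_hz, f2_hz, f3_hz, interval=0.060):
    trains = [pulse_times(f, interval) for f in (f1_hz, f2_hz, f3_hz)]
    return normalized_counts(*trains, interval=interval)

speaker_a = readout(500, 1500, 2500)   # assumed pulse rates
speaker_b = readout(550, 1650, 2750)   # the same rates shifted up by 10%
print(speaker_a, speaker_b)
```

Because the counting window shrinks as the second formant rate rises, the two hypothetical speakers yield the same pair of counts, and both counts fit in the eight bit counters.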

To these ends, the output signals of quantizer 17 and
zero crossing detector 19 are respectively applied
through inhibit gates 23 and 24 to eight bit counters 21
and 22. The output of zero crossing detector 18 is
applied through inhibit gate 30 to predetermined counter
25 which derives a binary one level in response to a
predetermined number of pulses, such as 64, being
applied to it since the last time it was reset; prior to
counter 25 reaching a count of 64, a binary zero is
derived from the output thereof. The binary one output of
counter 25 is applied to the inhibit inputs of gates 23, 24
and 30, thereby freezing the contents of counters 21, 22
and 25 until the counters are reset. Counters 21, 22 and 25
are periodically reset in response to each output pulse of
oscillator 26; each output pulse has sufficient length to
enable resetting and read out of counters 21, 22 and 25.
Typically, output pulses of oscillator 26 are derived
once every 60 milliseconds and have a duration of
approximately 50 microseconds. The output pulses of
oscillator 26 are applied to the inhibit terminals of
gates 23, 24 and 30 through OR gate 56 to positively
prevent coupling of pulses from quantizer 17 and
detectors 18 and 19 into counters 21, 22 and 25 if the
inhibit inputs of these gates were not previously
activated by the output of counter 25 during the
sampling interval being considered.

Delay networks 27 and 28, cascaded to the output of
oscillator 26, respond to the leading edge of the pulse
output of oscillator 26 to provide suitable delays (each
typically 20 microseconds) for enabling the contents of
counters 21 and 22 to be read out after the inputs to the
counters have been previously inhibited by the output
of oscillator 26 and to reset the counters after the
contents thereof have been read out.

In the described embodiment the counters 21 or 22 
are physically in the same integrated circuit package as 
the registers 31 or 32. The output of the delay circuit 27, 
the read pulse, is actually applied to the respective reg- 
isters to transfer the contents of the counters to the 
related registers, the information being transferred in a 
parallel operation. After completion of this operation, 
the output of the delay network 28 is applied as a reset 
input to each of the counters 21, 22 and 25. In terms of 
actual timing of the sequence of events, after the read 
pulse has terminated, registers 31 and 32 store signals 
indicative of the counts of counters 21 and 22 upon the 
completion of a 60 millisecond sampling interval. 
Thereafter, counters 21, 22 and 25 are reset to zero by 
the output of delay network 28 and a new counting 
interval is subsequently initiated when the trailing edge 
of the output pulse of oscillator 26 occurs to remove the 
inhibit inputs via OR gate 56 from gates 23, 24 and 30. 

The signals stored in registers 31 and 32 effectively 
represent the frequency content of the first formant 
relative to the second formant and the third formant 
relative to the second formant, respectively. Investiga- 
tions I have conducted have led me to believe that 
frequencies of the different formants vary relatively 
uniformly for the same phoneme for different speakers. 
My studies further indicate that the approximately 40 
different phonemes which constitute speech can be 
recognized for different speakers by comparing the 
frequencies of the first and third formants normalized 
against the second formant frequency. 

To identify the different phonemes, the signals stored 
in registers 31 and 32 are applied to a read only memory 
34 including voiced phoneme memory matrix 35 and 
unvoiced phoneme memory matrix 36. Memory mat- 
rices 35 and 36 are driven in parallel by the output 
signals of registers 31 and 32, as well as binary signals 
indicating whether a particular phoneme is a voiced or 
unvoiced utterance. The indication of a voiced or un- 
voiced utterance is derived by applying the output of 
amplifier 12 to low pass filter 41, which derives an 
output signal that is applied to a conventional
voiced/unvoiced detector 42. In response to a voiced
phoneme being detected, a binary signal is supplied by
detector 42 to memory matrix 35, causing that matrix to be
activated in response to the output signals of registers 31 
and 32. In response to an unvoiced phoneme being 
detected, detector 42 derives a binary one signal that is 
applied to matrix 36 via a delay network 43. The
delay of network 43 is equal to the time interval of a 
silent interval known to occur at the beginning of un- 
voiced phonemes. 

Matrices 35 and 36 respond to the output signals of 
registers 31 and 32 to locate a phoneme that is uniquely 
defined by the output signals of the registers. Asso- 
ciated with each memory location in matrices 35 and 36 
is an indicator. Thereby, the different phonemes are 
indicated in response to the indicator in memory mat- 
rices 35 and 36 being activated. 

The indicators in matrices 35 and 36 can be in the 
form of lamps, for assistance to the deaf. In the alterna- 
tive, the indicators can be utilized in a speech synthe- 
sizer to activate elements which cause utterances to be 
derived. In the latter instance, the device can be utilized 
in a narrow band width communication system.

The 60 millisecond sampling interval of the system
described in connection with FIG. 1 is approximately
one-half the length of a phoneme, according to generally
accepted theory. Thereby, in response to a phoneme
utterance, two successive, identical signals are
usually derived from registers 31 and 32. For more
positive phoneme identification, as well as for
synthesizing applications, the outputs of memory matrices 35 and 36
for successive sampling intervals can be compared and, if
they are the same, indicated as a phoneme.
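
The confirmation rule above can be sketched as follows. This is one possible reading, not a circuit given in the text: the function name, the use of `None` for an interval with no matrix output, and the choice to clear the stored value after a match are all assumptions.

```python
def confirmed_phonemes(readouts):
    """Report a phoneme only when two successive sampling intervals agree.

    `readouts` holds one matrix output per 60 ms interval, or None when
    no memory location was activated (a hypothetical convention).
    """
    confirmed = []
    previous = None
    for current in readouts:
        if current is not None and current == previous:
            confirmed.append(current)
            previous = None        # start fresh so one pair is reported once
        else:
            previous = current
    return confirmed

print(confirmed_phonemes(["a", "a", "b", "c", "c"]))
```

An isolated readout such as the single "b" is discarded, which is the added certainty the paragraph refers to.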

The device can also be utilized as a laboratory speech 
analyzing apparatus, in which case the output signals of 
registers 31 and 32, indicative of the normalized first 
and third formant frequencies, cause orthogonal X and 
Y deflection of a cathode ray beam included in cathode
ray tube 51 that comprises an X-Y display. To these 
ends, the output signals of registers 31 and 32 are re- 
spectively applied to digital-to-analog converters 52 
and 53 that drive the X and Y deflection electrodes 54 
and 55 of cathode ray tube 51.

To derive a signal indicative of the average frequency
of the first formant, quantizer 17 derives a bilevel signal
having a first level while the first formant signal has an
amplitude passing through even numbered ones of a
plurality of signal level bands. In response to the output
signal of filter 13 lying in the alternate, odd numbered
ones of the bands, quantizer 17 derives a second level of
the binary signal. Adjacent ones of the signal level
bands have common boundaries so that there is a
transition in the output of quantizer 17 as the amplitude of the
quantizer input has a transition from one band into
another. Since AGC amplifier 12 is provided, each of the
boundaries of the different bands is a predetermined
percentage of the peak level of a complete cycle of the
speech waveform. Thereby, the speech waveform
applied to quantizer 17 is effectively normalized. Better
normalization can, perhaps, be attained by connecting
an AGC amplifier between the output of filter 13 and
the input of quantizer 17.

Quantizer 17, according to one embodiment, includes
circuitry as illustrated in FIG. 2. The output of formant
filter 13 is applied to voltage divider 62, having taps
63-66, by capacitor 61, which establishes an average,
zero value for the AC undulations of the formant filter
output; the capacitor drives operational amplifier 101
including a feedback circuit comprising resistor 102 and
smoothing capacitor 103. The positions of taps 63-66 are
selected in accordance with the predetermined
percentages at which it is desired to establish boundaries for the
different bands.

The voltages developed at taps 63-66 are respectively
compared by analog comparators 71-74 with a d.c.
reference voltage at terminal 67. Comparators 71-74 are
arranged so that in response to the inputs thereof from
taps 63-66 being greater than the positive d.c. voltage at
terminal 67, positive d.c. voltages representing a binary
one are derived therefrom. In response to the voltages
applied to comparators 71-74 from taps 63-66 being less than
the voltage at terminal 67, comparators 71-74 derive
output voltages of zero magnitude, to represent a binary
zero. The output signals derived from comparators
71-74 drive logic network 104 that includes cascaded
inhibit gates 105-107, the last of which may drive an
optional voltage level detector, such as Schmitt trigger
108. Inhibit gates 105-107 are connected with each
other and the outputs of comparators 71-74 so that
alternating zero and one levels are derived from gate
107 as the output of amplifier 101 passes through
different boundary levels indicated in FIG. 3A, as
determined by the position of taps 63-66.

In operation, when the output of amplifier 101 is zero 
or less, indicated by the line 110 (FIG. 3A), none of 
comparators 71-74 derives a binary one output, so that 
a binary zero is derived from each of gates 105-107, as 
well as from trigger 108. In response to the output of 
amplifier 101 being between the levels indicated by 
boundaries 110 and 111, the voltage at tap 63 exceeds 
the voltage at terminal 67 but the voltages at taps 64-66 
are less than that at terminal 67 so comparator 71 de- 
rives a binary one to the exclusion of the other compara- 
tors. The binary one output of comparator 71 is coupled 
through gate 107 to Schmitt trigger 108, causing a bi- 
nary one to be derived from the quantizer output. In 
response to the output of amplifier 101 increasing fur- 
ther so that it lies between levels 111 and 112, compara- 
tors 71 and 72 derive binary one levels while compara- 
tors 73 and 74 derive binary zero levels. The binary one 
output of comparator 72 causes a binary one to be de- 
rived from gate 106, which inhibits gate 107 so that a 
low level is applied to Schmitt trigger 108, causing a 
binary zero to be derived from the quantizer. Similarly, 
when the output of amplifier 101 is between levels 112 
and 113, comparators 71-73 derive binary one levels, to 
the exclusion of comparator 74; the binary one output of 
comparator 73 is passed through gate 105 to inhibit gate 
106 so that the binary one output of comparator 71 is 
coupled to the input of Schmitt trigger 108, causing a 
binary one to be derived again from the quantizer. In 
response to the output of amplifier 101 being greater 
than level 113, each of comparators 71-74 derives a 
binary one level so that gate 105 is inhibited and gate 
106 derives a binary one level that inhibits gate 107, 
causing a binary zero to be applied to and derived from 
Schmitt trigger 108. If necessary, to provide for more 
positive control of gates 105-107, feedback resistors 
121-124 are optionally connected between the outputs 
of comparators 71-74 and the input of amplifier 101. 
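
The gate behavior walked through above can be checked with a small truth-table sketch. The wiring used here is inferred from the prose alone (FIG. 2 is not reproduced in this text), so it should be read as an assumption: gate 105 passes comparator 73 unless inhibited by comparator 74, gate 106 passes comparator 72 unless inhibited by gate 105, and gate 107 passes comparator 71 unless inhibited by gate 106.

```python
def quantizer_output(c71, c72, c73, c74):
    """Cascaded inhibit gates 105-107 as inferred from the description."""
    g105 = bool(c73) and not c74      # comparator 73 passes unless 74 inhibits
    g106 = bool(c72) and not g105     # comparator 72 passes unless gate 105 inhibits
    g107 = bool(c71) and not g106     # comparator 71 passes unless gate 106 inhibits
    return int(g107)

# Comparator states as the amplifier output rises through the boundaries.
states = [
    (0, 0, 0, 0),   # zero or less
    (1, 0, 0, 0),   # between boundaries 110 and 111
    (1, 1, 0, 0),   # between boundaries 111 and 112
    (1, 1, 1, 0),   # between boundaries 112 and 113
    (1, 1, 1, 1),   # above boundary 113
]
outputs = [quantizer_output(*s) for s in states]
print(outputs)
```

Under this reading the output is simply the parity of the number of comparators that have tripped, which is what makes the level alternate from band to band.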

The manner in which the present invention provides 
a number of pulses commensurate with the average 
frequency of the signal derived from formant filter 13 is 
best illustrated by reference to the waveforms of FIGS. 
3-6. The waveforms of FIGS. 3-6 all have the same 
maximum amplitude to provide a normalized situation. 
Output signals of formant filter 13 for four different 
phonemes are illustrated by FIGS. 3A-6A and the re- 
sultant binary signal derived from quantizer 17 for these 
phonemes are illustrated by FIGS. 3B-6B. FIG. 3A 
represents a wave that is a fundamental of a sinusoid; 
FIG. 4A represents the second harmonic of the
sinusoid, phase shifted -90° (at the second harmonic
frequency); FIG. 5A represents one-half the amplitude of the
fundamental plus one-half the amplitude of the phase shifted
second harmonic; and FIG. 6A represents one-third the 
amplitude of the fundamental plus two-thirds the ampli- 
tude of the phase shifted second harmonic. Mathemati- 
cally, the waveforms of FIGS. 3A-6A are represented 
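as:

$$f(t) = \sin \omega t \qquad (3)$$

$$f(t) = -\cos 2\omega t \qquad (4)$$

$$f(t) = \tfrac{1}{2}\sin \omega t - \tfrac{1}{2}\cos 2\omega t \qquad (5)$$

$$f(t) = \tfrac{1}{3}\sin \omega t - \tfrac{2}{3}\cos 2\omega t \qquad (6)$$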

The quantizer of FIG. 2 derives the pulse trains of
FIGS. 3B-6B in response to potentiometer taps 63-66
being set so that comparators 71-74 derive positive,
predetermined voltages in response to the voltages at
taps 63-66 being 10%, 30%, 50% and 70% of the peak
amplitude of the voltage applied to potentiometer 62
over a complete cycle of the speech waveform. The
number of negative going transitions in each of the
binary waveforms of FIGS. 3B-6B is different to
eliminate ambiguity of identification of the waveforms; for
the waveform of FIG. 3B there are four transitions,
for FIG. 4B there are eight transitions, for FIG. 5B there
are five transitions, and for FIG. 6B there are seven
transitions. In contrast, there is one positive going, zero
crossing for FIGS. 3A and 5A and two positive going,
zero crossings for the waveforms of FIGS. 4A and 6A.
The transitions of FIGS. 3B-6B are sensed by counter
21, causing the counter to be advanced by each of them.
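
An idealized numerical model of the quantizer can be sketched as follows, using boundaries at 10%, 30%, 50% and 70% of the peak and the waveforms of equations (3) to (6). The sampling density, the function names and the use of the waveform maximum in place of the AGC amplifier are assumptions. Counts for the mixed waveforms are sensitive to where the boundaries sit relative to small lobes, so a model like this should not be expected to reproduce figures read from the drawings.

```python
import math

THRESHOLDS = (0.10, 0.30, 0.50, 0.70)   # band boundaries as fractions of the peak

def bilevel(sample, peak):
    """Output level alternates each time another boundary is exceeded."""
    crossed = sum(sample > frac * peak for frac in THRESHOLDS)
    return crossed % 2

def negative_transitions(wave, n=10000):
    """Count 1 -> 0 transitions of the bilevel signal over one fundamental cycle."""
    samples = [wave(2 * math.pi * k / n) for k in range(n)]
    peak = max(samples)                  # stands in for the AGC normalization
    levels = [bilevel(s, peak) for s in samples]
    # Compare each level with its successor, wrapping around the cycle.
    successors = levels[1:] + levels[:1]
    return sum(1 for a, b in zip(levels, successors) if a == 1 and b == 0)

waves = {
    "eq. (3)": lambda x: math.sin(x),
    "eq. (4)": lambda x: -math.cos(2 * x),
    "eq. (5)": lambda x: 0.5 * math.sin(x) - 0.5 * math.cos(2 * x),
    "eq. (6)": lambda x: math.sin(x) / 3 - 2 * math.cos(2 * x) / 3,
}
counts = {name: negative_transitions(w) for name, w in waves.items()}
print(counts)
```

Even in this simplified form the transition count separates the pure fundamental from the pure second harmonic far more strongly than a single zero crossing count does.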

In certain instances, it is possible that there may be
ambiguity in the output of quantizer 17. In other words,
two materially different speech waveforms applied to
quantizer 17 may produce the same number of
transitions. Such ambiguity can be virtually eliminated if the
negative, as well as positive, portion of the speech
waveform is detected by the quantizer. Both the positive
and negative portions of the speech waveform may
be analyzed by the quantizer, by feeding the output of
capacitor 61 to a full wave rectifier that drives voltage
divider 62 or by providing a complementary comparison
network responsive to an inverted replica of the
output of amplifier 101, and by providing additional
gates in logic network 104 that are responsive to the
signals developed by the complementary comparison
network and are cascaded with gates 105-107.

While there has been described and illustrated one
specific embodiment of the invention, it will be clear
that variations in the details of the embodiment
specifically illustrated and described may be made without
departing from the true spirit and scope of the invention
as defined in the appended claims.

What is claimed is: 

1. Apparatus for quantizing a speech waveform and/or
a waveform that is a replica of an audio signal, such
as a telephone ring, a knock or a siren, comprising
means for establishing N signal level bands, where N is
an integer more than two, adjacent ones of said bands
having common boundaries, each of said boundaries
being a predetermined percentage of the peak level of a
complete cycle of the waveform, means responsive to
the established bands for deriving a bilevel signal
having a first level while the speech signal has an amplitude
lying in even numbered ones of said bands and a second
level while the speech signal has an amplitude lying in
odd numbered ones of said bands.

2. The apparatus of claim 1 wherein the means for
establishing the N signal level bands includes means for
normalizing the peak amplitude of a phoneme of the
speech waveform and a plurality of amplitude
comparators, one of said comparators being provided for each
boundary, each of said comparators being responsive to
the normalized waveform and a predetermined
amplitude level to derive an output signal having a first level
in response to the normalized waveform having an
amplitude less than the predetermined percentage for
the boundary associated with the comparator.

3. Apparatus for analyzing a speech waveform and/or
a waveform that is a replica of an audio signal, such as
a telephone ring, a knock or a siren, comprising formant
filter means responsive to the waveform for deriving
first, second and third signals respectively representing
the frequency content of the speech waveform in first,
second and third formants, and means responsive to the
first, second and third signals for separately normalizing
the first and third signals relative to the second signal.

4. The apparatus of claim 3 further including means 
responsive to the normalized first and third signals for 
deriving an indication of a phoneme in the speech wave- 
form. 

5. The apparatus of claim 4 wherein the indication 
deriving means includes a memory having first and 
second inputs responsive to the normalized first and 
third signals, respectively. 

6. The apparatus of claim 5 wherein the memory 
comprises a digital table look-up. 

7. The apparatus of claim 5 further including a voi- 
ced/unvoiced detector responsive to the speech wave- 
form and means for controlling the memory in response 
to voiced and unvoiced indications derived from the 
voiced/unvoiced detector. 

8. The apparatus of claim 4 wherein the indicator 
means includes a display having first and second or- 
thogonal axes, means for respectively controlling the 
display along said first and second axes in response to 
the normalized first and third signals. 

9. The apparatus of claim 4 wherein the means for 
deriving the first signal includes means for quantizing an 
analog signal indicative of the first formant into a first 
pulse train, said means for quantizing including means 
for establishing N signal level bands, where N is an 
integer at least equal to two, adjacent ones of said bands 
having common boundaries, each of said boundaries
being a predetermined percentage of the peak level of a
complete cycle of a speech waveform, means respon- 
sive to the established bands for deriving a bilevel signal 
having a first level while the speech signal has an ampli- 
tude lying in even numbered ones of said bands and a 
second level while the speech signal has an amplitude 
lying in odd numbered ones of said bands. 

10. The apparatus of claim 9 further including means 
for respectively quantizing analog signals indicative of 
the second and third formants into second and third 
pulse trains, and said means for normalizing includes 
means for counting the number of pulses in the first and 
third pulse trains over the interval required for the 
pulses in the second pulse train to reach a predeter- 
mined number. 
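
The counting form of normalization recited in claim 10 can be sketched as follows. This is a hedged illustration rather than the claimed apparatus: pulses are modeled as lists of time stamps, and the function names, the reference count of 30 and the pulse rates are hypothetical values chosen only for the example.

```python
import numpy as np

def normalized_counts(f1_times, f2_times, f3_times, n_reference):
    """Count first and third train pulses during the time the second train
    needs to deliver n_reference pulses."""
    f2_sorted = np.sort(np.asarray(f2_times, dtype=float))
    if f2_sorted.size < n_reference:
        raise ValueError("second pulse train never reaches the reference count")
    t_end = f2_sorted[n_reference - 1]           # instant of the n-th second-train pulse
    count_f1 = int(np.sum(np.asarray(f1_times) <= t_end))
    count_f3 = int(np.sum(np.asarray(f3_times) <= t_end))
    return count_f1, count_f3

def pulse_times(rate_hz, n_pulses):
    """Evenly spaced pulse instants at rate_hz, the first one period after t = 0."""
    return np.arange(1, n_pulses + 1) / rate_hz

# Hypothetical pulse rates for the three trains, and the same rates halved.
fast = normalized_counts(pulse_times(500, 100), pulse_times(1500, 100),
                         pulse_times(2500, 100), n_reference=30)
slow = normalized_counts(pulse_times(250, 100), pulse_times(750, 100),
                         pulse_times(1250, 100), n_reference=30)
print(fast, slow)
```

Because the counting interval is set by the second pulse train itself, scaling all three pulse rates by a common factor leaves both counts unchanged in this sketch, which is the sense in which the first and third counts are normalized relative to the second.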

11. The apparatus of claim 10 wherein the means for 
quantizing the analog signals indicative of the second 
and third formants includes a zero crossing detector 
responsive to the analog signals indicative of the second 
and third formants. 
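
A zero crossing detector of the kind recited in claim 11 can be modeled on a sampled signal as shown below. This is a minimal sketch under assumptions: only positive going crossings are treated as pulses, the function name is invented for the example, and the test signal is an arbitrary three-cycle cosine.

```python
import numpy as np

def positive_zero_crossings(x):
    """Return the sample indices at which x passes from below zero to zero or above."""
    x = np.asarray(x, dtype=float)
    return np.flatnonzero((x[:-1] < 0.0) & (x[1:] >= 0.0)) + 1

# Three cycles of an inverted cosine, so the signal starts below zero.
t = np.linspace(0.0, 1.0, 1000, endpoint=False)
x = -np.cos(2 * np.pi * 3 * t)
crossings = positive_zero_crossings(x)
print(len(crossings))
```

Each detected crossing corresponds to one pulse, so the pulse rate of the resulting train follows the frequency of the filtered formant signal.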

12. The apparatus of claim 4 further including means
for respectively quantizing analog signals indicative of 
the first, second and third formants into first, second 
and third pulse trains, and said means for normalizing 
includes means for counting the number of pulses in the 
first, second and third pulse trains over the interval 
required for the pulses in the second pulse train to reach 
a predetermined number. 

13. The apparatus of claim 4 further including means 
for periodically supplying the normalized signals to the 
means for indicating. 

14. The apparatus of claim 3 wherein the means for 
deriving the first signal includes means for quantizing an 
analog signal indicative of the first formant into a first 
pulse train, said means for quantizing including: means 
for establishing N signal level bands, where N is an 
integer at least equal to two, adjacent ones of said bands 
having common boundaries, each of said boundaries 
being a predetermined percentage of the peak level of a 
complete cycle of a speech waveform, means responsive
to the established bands for deriving a bilevel signal
having a first level while the speech signal has an ampli- 
tude lying in even numbered ones of said bands and a 
second level while the speech signal has an amplitude 
lying in odd numbered ones of said bands. 

15. The apparatus of claim 14 further including means
for respectively quantizing analog signals indicative of
the second and third formants into second and third
pulse trains, and said means for normalizing includes
means for counting the number of pulses in the first and
third pulse trains over the interval required for the
pulses in the second pulse train to reach a
predetermined number.

16. The apparatus of claim 15 wherein the means for
quantizing the analog signals indicative of the second
and third formants includes a zero crossing detector
responsive to the analog signals indicative of the second
and third formants.

17. The apparatus of claim 3 further including means
for respectively quantizing analog signals indicative of
the first, second and third formants into first, second
and third pulse trains, and said means for normalizing
includes means for counting the number of pulses in the
first, second and third pulse trains over the interval
required for the pulses in the second pulse train to reach
a predetermined number.

18. Apparatus for analyzing a speech waveform and/or
a waveform that is a replica of an audio signal, such
as a telephone ring, a knock or a siren, comprising
formant filter means responsive to the waveform for
deriving a pair of signals respectively representing the
frequency content of the speech waveform in a pair of
formants, and means responsive to the pair of signals for
comparing the signals representing the speech in the
pair of formants.

19. The apparatus of claim 18 wherein the means for
deriving the pair of signals includes means for
quantizing the waveform into first and second pulse trains
having pulse rates indicative of the frequency contents
in the formants, and means for counting the number of
pulses in the first pulse train over the interval required
for the pulses in the second train to reach a
predetermined number.

20. The apparatus of claim 19 wherein the means for
quantizing the waveform into the first pulse train
includes means for establishing N signal level bands,
where N is an integer at least equal to two, adjacent
ones of said bands having common boundaries, each of
said boundaries being a predetermined percentage of the
peak level of a complete cycle of the waveform, means
responsive to the established bands for deriving a
bilevel signal having a first level while the speech signal
has an amplitude lying in even numbered ones of said
bands and a second level while the speech signal has an
amplitude lying in odd numbered ones of said bands.

21. The apparatus of claim 18 wherein the means for
deriving one of the signals includes means for
quantizing the waveform into a first pulse train, said quantizing
means including means for establishing N signal level
bands, where N is an integer at least equal to two,
adjacent ones of said bands having common boundaries,
each of said boundaries being a predetermined
percentage of the peak level of a complete cycle of the
waveform, means responsive to the established bands for
deriving a bilevel signal having a first level while the
speech signal has an amplitude lying in even numbered
ones of said bands and a second level while the speech
signal has an amplitude lying in odd numbered ones of
said bands.

22. Apparatus for analyzing a speech waveform and/or
a waveform that is a replica of an audio signal, such
as a telephone ring, a knock or a siren comprising means
responsive to the waveform for deriving first and
second pulse trains respectively indicative of the frequency
of the waveform in first and second formants, and
means for counting the number of pulses in the first
pulse train over the interval required for the pulses in
the second train to reach a predetermined number.

23. The apparatus of claim 22 wherein the means for
deriving one of the pulse trains includes means for
establishing N signal level bands, where N is an integer at
least equal to two, adjacent ones of said bands having
common boundaries, each of said boundaries being a
predetermined percentage of the peak level of a
complete cycle of the waveform, means responsive to the
established bands for deriving a bilevel signal having a
first level while the speech signal has an amplitude lying
in even numbered ones of said bands and a second level
while the speech signal has an amplitude lying in odd
numbered ones of said bands.

24. Apparatus for analyzing a speech waveform and/or
a waveform that is a replica of an audio signal, such
as a telephone ring, a knock or a siren, comprising
formant filter means responsive to the waveform for
deriving first, second and third signals respectively
representing the frequency content of the speech waveform
in first, second and third formants, means responsive to
the first, second and third signals for normalizing the
first signal relative to the second signal, and means
responsive to the normalized first signal and a function
of the third signal for deriving an indication of a
phoneme in a speech waveform.

25. The apparatus of claim 24 wherein the indication
deriving means includes a memory having first and
second inputs responsive to the normalized first signal
and the function of the third signal.

26. The apparatus of claim 25 wherein the memory
comprises a digital table look-up.

27. The apparatus of claim 25 further including a
voiced/unvoiced detector responsive to the speech
waveform and means for controlling the memory in
response to voiced and unvoiced indications derived
from the voiced/unvoiced detector.

28. The apparatus of claim 24 wherein the indicator
means includes a display having first and second
orthogonal axes, means for respectively controlling the
display along said first and second axes in response to
the normalized first and third signals.

* * * * *